A Quantitative and Qualitative Evaluation of Sentence Boundary Detection for the Clinical Domain
Abstract
Sentence boundary detection (SBD) is a critical preprocessing task for many natural language processing (NLP) applications. However, there has been little work on evaluating how well existing methods for SBD perform in the clinical domain. We evaluate five popular off-the-shelf NLP toolkits on the task of SBD in various kinds of text using a diverse set of corpora, including the GENIA corpus of biomedical abstracts, a corpus of clinical notes used in the 2010 i2b2 shared task, and two general-domain corpora (the British National Corpus and Switchboard). We find that, with the exception of the cTAKES system, the toolkits we evaluate perform noticeably worse on clinical text than on general-domain text. We identify and discuss major classes of errors, and suggest directions for future work to improve SBD methods in the clinical domain. We also make the code used for SBD evaluation in this paper available for download at http://github.com/drgriffis/SBD-Evaluation.
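As an illustration of how SBD output can be scored against a reference segmentation, the following is a minimal sketch that compares predicted sentence-end offsets with gold offsets and reports precision, recall, and F1. The function name `boundary_scores` and the use of exact character-offset matching are assumptions made for this example; the abstract does not specify the metric used in the paper or in the linked repository.

```python
def boundary_scores(gold, predicted):
    """Score predicted sentence boundaries against gold boundaries.

    Both arguments are iterables of character offsets marking sentence ends.
    Exact-offset matching is an assumption of this sketch, not necessarily
    the paper's evaluation protocol.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)  # boundaries found in both sets

    # Guard against empty sets to avoid division by zero.
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical offsets: the system adds one spurious boundary at 33.
    p, r, f = boundary_scores([10, 25, 40], [10, 25, 33, 40])
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A spurious boundary (for example, a split after an abbreviation in a clinical note) lowers precision, while a missed boundary lowers recall, so reporting both separates the two error types.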
Similar resources
Development and Evaluation of Real-Time Reverse Transcription Polymerase Chain Reaction Test for Quantitative and Qualitative Recognition of H5 Subtype of Avian Influenza Viruses
Avian influenza viruses (AIV) affect a wide range of birds and mammals, cause severe economic damage to the poultry industry, and pose a serious threat to humans. Highly pathogenic avian influenza viruses (HPAI) H5N1 were first identified in Southeast Asia in 1996 and spread to four continents over the following years. The viruses have caused high mortality in chickens and various bird species ...
Development and Evaluation of Real-Time RT-PCR Test for Quantitative and Qualitative Recognition of Current H9N2 Subtype Avian Influenza Viruses in Iran
Avian influenza H9N2 subtype viruses have had a great impact on the economy of Iranian industrial poultry production since their introduction into the country. To enable rapid and precise identification of these viruses as a control measure in the poultry industry, a real-time probe-based assay was developed to directly detect a specific influenza virus of the H9N2 subtype, instead of general detection of Influenza A ...
Semantic Role Labeling of Persian Sentences Using a Memory-Based Learning Approach
Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...
Evaluation of Nursing Management Internship: A Mixed Methods Study
Introduction: Nursing students can develop clinical skills such as management only if they are given enough opportunities to acquire and practice those skills in a clinical environment. Therefore, given the importance of this issue and the lack of information and background on the subject, this study aimed to evaluate the management internship of nursing students through both qualitativ...
Optimum Characteristics of Nursing Students’ Clinical Evaluation: Clinical Nursing Teachers’ Viewpoints in Isfahan University of Medical Sciences
Introduction: Despite the importance of evaluating students’ clinical competencies throughout the nursing education program, the optimum characteristics (indices) of clinical evaluation methods remain controversial and there is no consensus. This study was performed to determine the optimum characteristics of clinical competency evaluation and to assess clinical nursing teachers’ viewpoints. Metho...
Journal title:
Volume 2016, issue
Pages -
Publication date: 2016